Data mining applied to linkage disequilibrium mapping.

Authors

  • H T Toivonen
  • P Onkamo
  • K Vasko
  • V Ollikainen
  • P Sevon
  • H Mannila
  • M Herr
  • J Kere
Abstract

We introduce a new method for linkage disequilibrium mapping: haplotype pattern mining (HPM). The method, inspired by data mining methods, is based on discovery of recurrent patterns. We define a class of useful haplotype patterns in genetic case-control data and use the algorithm for finding disease-associated haplotypes. The haplotypes are ordered by their strength of association with the phenotype, and all haplotypes exceeding a given threshold level are used for prediction of disease susceptibility-gene location. The method is model-free, in the sense that it does not require (and is unable to utilize) any assumptions about the inheritance model of the disease. The statistical model is nonparametric. The haplotypes are allowed to contain gaps, which improves the method's robustness to mutations and to missing and erroneous data. Experimental studies with simulated microsatellite and SNP data show that the method has good localization power in data sets with large degrees of phenocopies and with lots of missing and erroneous data. The power of HPM is roughly identical for marker maps at a density of 3 single-nucleotide polymorphisms/cM or 1 microsatellite/cM. The capacity to handle high proportions of phenocopies makes the method promising for complex disease mapping. An example of correct disease susceptibility-gene localization with HPM is given with real marker data from families from the United Kingdom affected by type 1 diabetes. The method is extendable to include environmental covariates or phenotype measurements or to find several genes simultaneously.
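The abstract does not give algorithmic details, so the following is only a minimal sketch of the general idea it describes: enumerate haplotype patterns (optionally with gaps), score each by its association with case status, keep those above a threshold, and use them to score marker positions. Everything specific here is an assumption for illustration and not the authors' implementation: the 2x2 chi-square statistic, the threshold of 3.84, the limits `max_len` and `max_gaps`, the rule that a marker's score is the number of strong patterns that fix an allele there, and all function names and toy data.

```python
from collections import defaultdict
from itertools import combinations

GAP = "*"  # wildcard: matches any allele at that marker (assumed notation)


def candidate_patterns(haplotype, max_len=4, max_gaps=1):
    """Yield (start, pattern) for every window of the haplotype.

    Interior positions of a window may be replaced by GAP (at most
    max_gaps of them); the first and last positions are never gaps.
    """
    n = len(haplotype)
    for start in range(n):
        for length in range(1, min(max_len, n - start) + 1):
            window = tuple(haplotype[start:start + length])
            inner = range(1, length - 1)
            for k in range(max_gaps + 1):
                for gap_positions in combinations(inner, k):
                    pattern = list(window)
                    for g in gap_positions:
                        pattern[g] = GAP
                    yield start, tuple(pattern)


def chi_square(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def marker_scores(cases, controls, threshold=3.84, max_len=4, max_gaps=1):
    """Count, per marker, the strongly case-associated patterns that
    fix an allele at that marker (a hypothetical localization score)."""
    counts = defaultdict(lambda: [0, 0])  # pattern -> [cases, controls]
    for label, group in ((0, cases), (1, controls)):
        for hap in group:
            # set() so each haplotype counts a pattern at most once
            for key in set(candidate_patterns(hap, max_len, max_gaps)):
                counts[key][label] += 1

    scores = [0] * len(cases[0])
    for (start, pattern), (a, b) in counts.items():
        c, d = len(cases) - a, len(controls) - b
        # keep only patterns that are more frequent in cases
        if a * d > b * c and chi_square(a, b, c, d) >= threshold:
            for offset, allele in enumerate(pattern):
                if allele != GAP:
                    scores[start + offset] += 1
    return scores


# Toy data (invented): most case haplotypes carry alleles 2-2 at markers 1 and 2.
cases = [
    (1, 2, 2, 1), (2, 2, 2, 2), (1, 2, 2, 2), (2, 2, 2, 1),
    (1, 2, 2, 1), (2, 2, 2, 2), (1, 2, 2, 2), (2, 1, 1, 1),
]
controls = [
    (1, 1, 1, 1), (2, 1, 1, 2), (1, 1, 1, 2), (2, 1, 1, 1),
    (1, 1, 1, 1), (2, 1, 1, 2), (1, 1, 1, 2), (2, 2, 2, 1),
]
scores = marker_scores(cases, controls)
best_marker = max(range(len(scores)), key=scores.__getitem__)
```

In this toy setting the highest-scoring markers are the ones where the shared case haplotype sits, which is the intuition behind using the set of strongly associated patterns to predict the gene location. A realistic analysis would also need a significance assessment, for example by permuting case and control labels, which is omitted here.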

Similar resources

Haplotype-based linkage disequilibrium mapping via direct data mining

MOTIVATION With the availability of large-scale, high-density single-nucleotide polymorphism markers and information on haplotype structures and frequencies, a great challenge is how to take advantage of haplotype information in the association mapping of complex diseases in case-control studies. RESULTS We present a novel approach for association mapping based on directly mining haplotypes (...

Computational population-based techniques in identifying genetic variants associated with simulated complex disorder in a general population

Several techniques for association analysis have been applied to the simulated data set for the general population (Problem 2 of the Genetic Analysis Workshop 12). We have focused our efforts on the pedigree founders who did not have any living parents. Entire pedigrees were also used in several methods to compare whether the inclusion of the offspring is beneficial. Association methods have be...

On a Family-Based Haplotype Pattern Mining Method for Linkage Disequilibrium Mapping

Linkage disequilibrium mapping is an important tool in disease gene mapping. Recently, Toivonen et al. [1] introduced a haplotype mining (HPM) method that is applicable to data consisting of unrelated high-risk and normal haplotypes. The HPM method orders haplotypes by their strength of association with trait values, and uses all haplotypes exceeding a given threshold of strength of association...

The Pattern of Linkage Disequilibrium in Livestock Genome

Linkage disequilibrium (LD) is the basis of genomic selection, genomic marker imputation, marker-assisted selection (MAS), quantitative trait loci (QTL) mapping, parentage testing and whole-genome association studies. Particular alleles at closely spaced loci tend to be co-inherited. At linked loci this pattern leads to an association between alleles in the population, which is known as LD. Two metr...

Efficient mining of haplotype patterns for linkage disequilibrium mapping.

Effective identification of disease-causing gene locations can have significant impact on patient management decisions that will ultimately increase survival rates and improve the overall quality of health care. Linkage disequilibrium mapping is the process of finding disease gene locations through comparisons of haplotype frequencies between disease chromosomes and normal chromosomes. This wor...

Journal title:
  • American journal of human genetics

Volume 67, Issue 1

Pages -

Publication date: 2000